Tensorflow Tutorial (writen by experience)

Alex Fang

1.Recurrent neural network

RNNäøŽLSTMåŸŗęœ¬äøŠę˜Æäø€ę ·ēš„ļ¼ŒLSTMåŖę˜Æå¤šäŗ†é—åæ˜é—Øļ¼Œęœ‰äŗ†é•æēŸ­ęœŸč®°åæ†ēš„ęœŗåˆ¶ļ¼Œä¼šęÆ”RNNę›“å„½äø€äŗ›ć€‚ēŽ°åœØę–‡ē« äø­ęåˆ°ēš„RNNéƒ½ę˜Æē‰¹ęŒ‡LSTMć€‚å¦‚ęžœę˜ÆNLPé¢†åŸŸļ¼Œå°±ę›“ę˜Æē‰¹ęŒ‡åÆ¹čÆå‘é‡åŠ äŗ†attentionēš„LSTMć€‚å¹¶äø”ļ¼ŒåœØtensorflow中,LSTMäøŽRNNåœØē¼–ēØ‹äøŠēš„å·®č·åŖåœØäø€č”Œä»£ē ēš„é•æåŗ¦ć€‚å› ę­¤ļ¼ŒåŽē»­ęˆ‘åšäŗ†3äøŖåŸŗäŗŽLSTMēš„å®žéŖŒć€‚

RNN.png

RNN2.png

LSTM.png

Basic example: sin(x)-cos(x)

In [2]:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
In [9]:
steps = np.linspace(0, np.pi*2, 100, dtype=np.float32)
x_np = np.sin(steps)
y_np = np.cos(steps)
plt.plot(steps, y_np, 'r-', label='target (cos)')
plt.plot(steps, x_np, 'b-', label='input (sin)')
plt.legend(loc='best')
plt.show()
In [4]:
TIME_STEP = 10       # rnn time step
INPUT_SIZE = 1      # rnn input size
CELL_SIZE = 32      # rnn cell size
LR = 0.02           # learning rate
In [5]:
tf.reset_default_graph()
tf_x = tf.placeholder(tf.float32, [None, TIME_STEP, INPUT_SIZE])
tf_y = tf.placeholder(tf.float32, [None, TIME_STEP, INPUT_SIZE])

#it's really important to form the habit to use name_scope, especially when you use lstm and rnn!!!

#with tf.name_scope("first lstm")
rnn_cell = tf.contrib.rnn.BasicLSTMCell(CELL_SIZE, forget_bias=0.2, state_is_tuple=True)
init_s = rnn_cell.zero_state(batch_size=1, dtype=tf.float32)    # very first hidden state
outputs, final_s = tf.nn.dynamic_rnn(
    rnn_cell,                   # cell you have chosen
    tf_x,                       # input
    initial_state=init_s,       # the initial hidden state
    time_major=False,           # False: (batch, time step, input); True: (time step, batch, input)
)

outs2D = tf.reshape(outputs, [-1, CELL_SIZE])                       # reshape 3D output to 2D for fully connected layer
net_outs2D = tf.layers.dense(outs2D, INPUT_SIZE)
outs = tf.reshape(net_outs2D, [-1, TIME_STEP, INPUT_SIZE])          # reshape back to 3D

loss = tf.losses.mean_squared_error(labels=tf_y, predictions=outs)  # compute cost
train_op = tf.train.AdamOptimizer(LR).minimize(loss)

sess = tf.Session()
sess.run(tf.global_variables_initializer())     # initialize var in graph

los=[]
y_real=[]
y_pred=[]
for step in range(200):
    
    start, end = step * np.pi, (step+1)*np.pi   # time range
    # use sin predicts cos
    steps = np.linspace(start, end, TIME_STEP)
    x = np.sin(steps)[np.newaxis, :, np.newaxis]    # shape (batch, time_step, input_size)
    y = np.cos(steps)[np.newaxis, :, np.newaxis]
    
    if 'final_s_' not in globals():                 # first state, no any hidden state
        feed_dict = {tf_x: x, tf_y: y}
    else:                                           # has hidden state, so pass it to rnn
        feed_dict = {tf_x: x, tf_y: y, init_s: final_s_}
    _,los_, pred_, final_s_ = sess.run([train_op,loss, outs, final_s], feed_dict)     # train
    if step%10==0:
        print(los_)
        los.append(los_)
    
    pred_=pred_.reshape(-1,1)
    for item in range(len(pred_)):
        y_pred.append(pred_[item][0])
    

plt.plot(range(len(los)),los)
plt.xlabel("Iterations")
plt.ylabel("Loss")
0.53770435
0.25929925
0.110445336
0.017763395
0.014540821
0.0010775086
0.003960844
0.0026764194
0.00094303937
0.00065490545
0.0013096994
0.0015922571
0.0011566079
0.00065911596
0.0021370626
0.017605636
0.078477025
0.01864163
0.030451793
0.016260507
Out[5]:
Text(0, 0.5, 'Loss')

mass3.gif

Real example: PM2.5 prediction

In [55]:
import warnings
warnings.filterwarnings('ignore')
import tensorflow as tf
import pandas as pd
import numpy as np
import time
import xgboost as xgb
np.random.seed(1)
import os
import sys
from sklearn.preprocessing import MinMaxScaler
#os.environ['CUDA_VISIBLE_DEVICES'] = '0'
In [50]:
#Table for the last experiment.
class Generate_data:
        
    def __init__(self,problem_type,i,path,key):
        
        raw_data=pd.read_csv(path)
        raw_data=pd.DataFrame(raw_data[key])
        raw_data=raw_data.dropna()
        raw_data=list(raw_data[key])
        raw_data=raw_data[8000*i:8000*(i+1)]
        mm = MinMaxScaler()
        x = mm.fit(np.array(raw_data[0:6400]).reshape(-1,1))
        raw_data = x.transform(np.array(raw_data).reshape(-1,1))
        raw_data = list(raw_data)
        if len(raw_data)%2==0:
            self.raw_data=raw_data
        else:
            self.raw_data=raw_data[1:]
        
        
    def SMA(self,lag1,lag2,look):
        
        data=self.raw_data
        for j in range(len(lag1)):
            l1=lag1[j]
            l2=lag2[j]
    
            x=np.full([len(data)-200,l2],np.nan)
            x2=np.full([len(data)-200,30],np.nan)
            y=np.full([len(data)-200,1],np.nan)
            y2=np.full([len(data)-200,1],np.nan)
            for i in range(100,len(data)-100):
                x[i-100,:]=data[i-l2:i]
                x2[i-100,:]=data[i-30:i]
                y2[i-100,:]=data[i-1+look]
                if data[i-1+look]-data[i-1]>=0:
                    y[i-100,:]=int(1)
                else:
                    y[i-100,:]=int(0)
            
            point=int(x.shape[0]*0.8)
            x_train=x[:point,:]
            x_test=x[point:,:]
            x_train2=x2[:point,:]
            x_test2=x2[point:,:]
            label_train=y[:point,:]
            label_test=y[point:,:]
            regression_train=y2[:point,:]
            regression_test=y2[point:,:]
            
            y_train=np.full([x_train.shape[0],2],np.nan)
            for i in range(x_train.shape[0]):
                y_train[i,0]=np.mean(x_train2[i,30-l2:])-np.mean(x_train2[i,30-l1:])
                y_train[i,1]=np.mean(x_train2[i,30-l2:-1])-np.mean(x_train2[i,30-l1:-1])
    
            y_test=np.full([x_test.shape[0],2],np.nan)
            for i in range(x_test.shape[0]):
                y_test[i,0]=np.mean(x_test2[i,30-l2:])-np.mean(x_test2[i,30-l1:])
                y_test[i,1]=np.mean(x_test2[i,30-l2:-1])-np.mean(x_test2[i,30-l1:-1])
            
            setattr(self,'sma_x_train'+str(j),x_train2)
            setattr(self,'sma_y_train'+str(j),y_train)
            setattr(self,'sma_x_test'+str(j),x_test2)
            setattr(self,'sma_y_test'+str(j),y_test)
            setattr(self,'label_train',label_train)
            setattr(self,'label_test',label_test)
            setattr(self,'regression_train',regression_train)
            setattr(self,'regression_test',regression_test)
    
    
    def RSI(self,lag_rsi,look):
        
        data=self.raw_data
        for j in range(len(lag_rsi)):
            l1=lag_rsi[j]
            x=np.full([len(data)-200,l1],np.nan)
            x2=np.full([len(data)-200,10],np.nan)
            y=np.full([len(data)-200,1],np.nan)
            
            for i in range(100,len(data)-100):
                x[i-100,:]=data[i-l1:i]
                x2[i-100,:]=data[i-10:i]
                if data[i-1+look]-data[i-1]>=0:
                    y[i-100,:]=int(1)
                else:
                    y[i-100,:]=int(0)
            
            point=int(x.shape[0]*0.8)
            x_train=x[:point,:]
            x_test=x[point:,:]
            x_train2=x2[:point,:]
            x_test2=x2[point:,:]
            label_train=y[:point,:]
            label_test=y[point:,:]
            
            y_train=np.full([x_train.shape[0],2],np.nan)
            for i in range(x_train.shape[0]):
                series=list(x_train[i,:])
                series=pd.Series(series,index=range(len(series)))
                d1_series=series-series.shift(1)
                d1_series=d1_series.dropna()
                d1_series=np.array(d1_series)
                up_list=d1_series[d1_series>=0]
                down_list=d1_series[d1_series<0]
                up=np.mean(up_list)
                down=abs(np.mean(down_list))
                ans=(up/down)/(1+(up/down))
                if np.isnan(ans):
                    ans=0
                y_train[i,0]=ans
    
            y_test=np.full([x_test.shape[0],2],np.nan)
            for i in range(x_test.shape[0]):
                series=list(x_test[i,:])
                series=pd.Series(series,index=range(len(series)))
                d1_series=series-series.shift(1)
                d1_series=d1_series.dropna()
                d1_series=np.array(d1_series)
                up_list=d1_series[d1_series>=0]
                down_list=d1_series[d1_series<0]
                up=np.mean(up_list)
                down=abs(np.mean(down_list))
                ans=(up/down)/(1+(up/down))
                if np.isnan(ans):
                    ans=0
                y_test[i,0]=ans
                
            setattr(self,'rsi_x_train'+str(j),x_train)
            setattr(self,'rsi_y_train'+str(j),y_train)
            setattr(self,'rsi_x_test'+str(j),x_test)
            setattr(self,'rsi_y_test'+str(j),y_test)
            setattr(self,'rsi_label_train',label_train)
            setattr(self,'rsi_label_test',label_test)
In [58]:
class Baseline:
    
    def __init__(self):
        pass
    
    def LSTM(self,data):
        
        tf.reset_default_graph()
        X_train=solve.sma_x_train0[-1500:,:]
        X_test=solve.sma_x_test0[:1500,:]
        Y_train=solve.regression_train[-1500:,:].reshape(-1,1)
        Y_test=solve.regression_test[:1500,:].reshape(-1,1)
        label_train=solve.label_train[-1500:,:].reshape(-1,1)
        label_test=solve.label_test[:1500,:].reshape(-1,1)
        count=0
        
        TIME_STEP = X_test.shape[1]
        INPUT_SIZE = 1 
        CELL_SIZE = 50
        
        tf_x = tf.placeholder(tf.float32, [None, TIME_STEP, INPUT_SIZE]) 
        tf_y = tf.placeholder(tf.int32, [None, INPUT_SIZE])
        tf_is_training = tf.placeholder(tf.bool, None)
        
        # LSTM
        sess = tf.Session()
        #rnn header
        rnn_cell_1 = tf.contrib.rnn.BasicLSTMCell(num_units=CELL_SIZE)
        init_s_1 = rnn_cell_1.zero_state(batch_size=X_train.shape[0], dtype=tf.float32)
        outputs_1, final_s_1 = tf.nn.dynamic_rnn(
            rnn_cell_1,                   # cell you have chosen
            tf_x,                       # input
            initial_state=init_s_1,     # the initial hidden state
            time_major=False,          # False: (batch, time step, input); True: (time step, batch, input)
        )
        
        outs2D_1 = tf.reshape(outputs_1, [-1, CELL_SIZE])                       
        net_outs2D_1 = tf.layers.dense(outs2D_1, INPUT_SIZE)
        outs_1 = tf.reshape(net_outs2D_1, [-1, TIME_STEP, INPUT_SIZE])
        outs_1 =  tf.reshape(outs_1, [-1,TIME_STEP])           
        out1 = tf.layers.dense(outs_1, 1)
        loss1 = tf.losses.mean_squared_error(labels=tf_y, predictions=out1) 
        optimizer1 = tf.train.AdamOptimizer(learning_rate=0.0002,beta1=0.9,beta2=0.999)
        #optimizer = tf.train.MomentumOptimizer(learning_rate=0.005,momentum=0.9)
        train_op1 = optimizer1.minimize(loss1)
        sess.run(tf.global_variables_initializer())
        
        for step in range(1000):
            if step==0:                 
                feed_dict = {tf_x: X_train.reshape(-1,TIME_STEP,1),tf_y: Y_train.reshape(-1,INPUT_SIZE),tf_is_training: True}
            else:                                    
                feed_dict = {tf_x: X_train.reshape(-1,TIME_STEP,1),tf_y: Y_train.reshape(-1,INPUT_SIZE),init_s_1: final_s_,tf_is_training: True}
            
            _, pred_, los,final_s_= sess.run([train_op1, out1, loss1,final_s_1], feed_dict)
            if step%50==0:
                print(los)
                
        feed_dict = {tf_x: X_test.reshape(-1,TIME_STEP,1),tf_y: Y_test.reshape(-1,INPUT_SIZE),init_s_1: final_s_,tf_is_training: False}
        _, pred_, los,final_s_= sess.run([train_op1, out1, loss1,final_s_1], feed_dict)
        print('Test los '+str(los))
        for i in range(len(pred_)):
            if (pred_[i,0]-X_test[i,-1]>0.5)&(label_test[i,0]==1):
                count+=1
            elif (pred_[i,0]-X_test[i,-1]<=0.5)&(label_test[i,0]==0):
                count+=1
            
        acc=count/len(pred_)
        return acc
        
    
    def EMA(self,data):
        
        alpha=0.1
        X=solve.sma_x_test0
        Y=solve.label_test
        count=0
        
        for j in range(len(X)):
            
            temp=X[j,:]
            s=[]
            for i in range(len(temp)):
                s.append(temp[i])
            
            s_temp=[]
            s_temp.append(s[0])
            for i in range(len(s)):
                s_temp.append(alpha * s[i-1] + (1 - alpha) * s_temp[i-1])
                
            pred=s_temp[-1]-s[-1]
            if (pred>0)&(Y[j,0]==1):
                count+=1
            elif (pred<=0)&(Y[j,0]==0):
                count+=1
        
        acc=count/len(X)
        return acc
In [57]:
PATH=["PRSA_data_2010.1.1-2014.12.31.csv","/home/fangjie/Reverse/SZ_index.csv","/home/fangjie/Reverse/HS300_index.csv","/home/fangjie/Reverse/Industry_index.csv"]
KEY=["pm2.5","close","close","close"]
#we do test on the train data to find a good but not optimal prior knowledge
SMA_pk=[[5,25],[15,25],[3,7],[3,7]]
RSI_pk=[[10,30],[10,30],[10,30],[10,30]]

path1=PATH[0]
key1=KEY[0]
solve=Generate_data("classification",0,path1,key1)
solve.SMA(lag1=[SMA_pk[0][0]],lag2=[SMA_pk[0][1]],look=1)
solve.RSI(lag_rsi=[RSI_pk[0][1],RSI_pk[0][0]],look=1)
In [59]:
baseline=Baseline()
base2=baseline.LSTM(solve)
#baseline
baseline=Baseline()
base1=baseline.EMA(solve)
base2=baseline.LSTM(solve)
print("The classification accuracy for EMA is %s"%str(base1))
print("The classification accuracy for LSTM is %s"%str(base2))
0.00058703625
1.2635114e-05
2.756896e-06
1.2203041e-06
6.5266596e-07
4.5367457e-07
3.9154523e-07
3.6287136e-07
3.4023674e-07
3.1913422e-07
2.9917345e-07
2.8037715e-07
2.6274208e-07
2.4622454e-07
2.3075917e-07
2.1627126e-07
2.0268222e-07
1.8991682e-07
1.7790559e-07
1.6658649e-07
Test los 8.512103e-07
4.3289638e-06
5.2512767e-08
1.8329056e-08
9.083306e-09
5.4513367e-09
3.731956e-09
2.7858222e-09
2.1907038e-09
1.7791765e-09
1.4783945e-09
1.2519921e-09
1.0785354e-09
9.43746e-10
8.379064e-10
7.54064e-10
6.8719336e-10
6.335503e-10
5.9023225e-10
5.549884e-10
5.26014e-10
Test los 1.991732e-09
The classification accuracy for EMA is 0.49615384615384617
The classification accuracy for LSTM is 0.462

Real example: Stock price prediction

In [60]:
PATH=["PRSA_data_2010.1.1-2014.12.31.csv","/home/fangjie/Reverse/SZ_index.csv","HS300_index.csv","/home/fangjie/Reverse/Industry_index.csv"]
KEY=["pm2.5","close","close","close"]
#we do test on the train data to find a good but not optimal prior knowledge
SMA_pk=[[5,25],[15,25],[3,7],[3,7]]
RSI_pk=[[10,30],[10,30],[10,30],[10,30]]

path1=PATH[2]
key1=KEY[2]
solve=Generate_data("classification",0,path1,key1)
solve.SMA(lag1=[SMA_pk[2][0]],lag2=[SMA_pk[2][1]],look=1)
solve.RSI(lag_rsi=[RSI_pk[2][1],RSI_pk[2][0]],look=1)
In [61]:
baseline=Baseline()
base2=baseline.LSTM(solve)
#baseline
baseline=Baseline()
base1=baseline.EMA(solve)
base2=baseline.LSTM(solve)
print("The classification accuracy for EMA is %s"%str(base1))
print("The classification accuracy for LSTM is %s"%str(base2))
0.0003931609
4.066331e-05
8.7670805e-06
1.543587e-06
9.159164e-06
9.764499e-07
3.8328767e-07
2.8661879e-07
2.5401908e-07
2.2924007e-07
2.063949e-07
1.8465504e-07
1.6410573e-07
1.4492623e-07
1.2725073e-07
1.1117077e-07
9.670846e-08
8.381878e-08
7.239905e-08
6.2320666e-08
Test los 1.8901674e-05
3.3713718e-06
7.979546e-05
2.4395142e-06
2.3246007e-06
2.221885e-06
2.1083458e-06
1.9951553e-06
1.8891606e-06
1.7896208e-06
1.6888804e-06
1.5693462e-06
1.3985082e-06
1.1265101e-06
7.0131074e-07
2.7687614e-07
1.9036217e-07
1.6656918e-07
1.4944075e-07
1.3601547e-07
1.2468186e-07
Test los 0.00038643257
The classification accuracy for EMA is 0.41794871794871796
The classification accuracy for LSTM is 0.49666666666666665

Other people's experiment: %E4%B8%8B%E8%BD%BD.png

Maybe, we should get rid of the noise burried in data. Or we can try to use other structure. For example, dual attention mechanism + LSTM + Prior Knowledge.

Almost every man who have interest both in stocks and deep learning, they will firstly try LSTM. However, almost all frineds of mine (tsinghua), all the talent people I have heared about, they thought LSTM is useless for stock. First, it only try to regress the price, not to tell the tendency. You can look closely on the picture above, it only learns the t-1. Secondly, its regress performance may be better than SMA, however, it is almost the same with exponential moving average in this problem.

If you want to make huge imporvements in deep learning and fiannce, forget LSTM. At least before my document's publication. why? Because in this moment, LSTM's structure doesn;t work. And so many talent people have tried it. It will cost much more time态talent to make a success in this field.